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ABSTRACT 


Robust  methods  provide  a  fresh  approach  to  the  treatment  of  outliers 
or  wild  observations  in  filtering  and  smoothing  applications.  The  robust 
M-estimates  of  regression  are  extended  to  filtering  and  fixed  lag  smoothing 
by  employing  a  pseudo  density  of  the  observations  In  the  derivations  of  the 
filter  and  fixed  lag  smoother.  These  robust  methods  are  applied  to  track¬ 
ing  data  to  obtain  improved  estimation  performance  in  the  presence  of  wild 
observations.  The  improvement  in  estimation  performance  is  evaluated  via 
Monte  Carlo  using  simulated  tracking  data. 
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I.  INTRODUCTION 


Robust  filtering  and  smoothing  are  a  natural  extension  of  the  robust 
M-estimates  of  regression  which  have  been  developed  by  Huber  [1].  The  M-estimates 
of  regression  have  been  designed  to  perform  well  when  the  observations  are  con¬ 
taminated  by  outliers.  The  conventional  estimation  techniques  of  least  squares, 
minimum  variance,  maximum  likelihood,  etc.  may  become  useless  when  the  observations 
are  contaminated  by  outliers.  The  robust  M-estimates  have  been  extremely  successful 
in  dealing  with  outliers  In  other  data  reduction  problems,  [2],  Outliers  or  wild 
data  also  present  a  problem  when  they  occur  in  the  observation  sequence  of  a  filter 
or  smoother.  These  outliers  have  often  been  treated  by  testing  the  filter  or  smoother 
residuals.  If  the  residual  is  too  large  as  compared  with  some  measure  of  dispersion 
of  the  residuals,  the  corresponding  observation  was  rejected  as  being  an  outlier. 
Otherwise,  the  observation  is  processed  normally  by  the  filter  or  smoother.  This 
procedure  was  often  successful  if  only  a  small  number  of  outliers  were  present  but 
may  breakdown  when  a  larger  proportion  of  outliers  were  present  in  the  observation 
sequence.  Also,  in  order  for  such  an  outlier  detection  method  to  be  successful,  a 
robust  measure  of  dispersion  of  the  filter  or  smoother  residuals  was  necessary. 

These  old  methods  of  treating  outliers  in  filters  or  smoother  observations  were 
added  to  the  filter  or  smoother  process  as  an  afterthought.  In  contrast  to  this 
the  development  of  robust  filtering  and  smoothing  methods  by  the  use  of  M-estimates 
provides  a  method  of  treating  outlying  observations  which  Is  Inherent  in  the  filter 
or  smoother  equations. 

Very  little  development  has  appeared  on  the  application  of  robust  estimation 
to  filtering  and  smoothing.  The  most  significant  effort  known  to  the  author  is 
the  paper  of  Masrellez  and  Martin,  [3],  Their  development  on  the  application  of 
M-estimates  to  the  Kalman  filter  is  mainly  theoretical.  The  emphasis  in  this  re¬ 
port  will  be  on  the  development  of  some  practical  results  on  the  application  of 
M-estimates  to  robust  filtering  and  smoothing  and  the  evaluation  of  these  techniques 
for  real  and  simulated  tracking  data. 

The  M-estimates  for  regression  are  discussed  extensively  in  [1],  [2],  [4]  , 
and  [5],  The  following  description  provides  a  brief  introduction  to  these  robust 
regression  estimates.  Given  scalar  observations  y^,  i«l,n  of  the  linear  model, 

y1  ■  X^e  +  e1t  (1) 


r 


where  is  a  row  vector  of  known  independent  variables  and  e^  is  a  random 
error  tena.  We  want  to  estimate  the  p- vector,  e.  The  H-estiaate  of  e  mini  arizes 

J,  *  ((ji  -  xi9)/s )  •  (2) 


where  p( ■ )  is  a  specified  function  and  s  is  a  robust  Measure  of  the  dispersion 
of  the  residuals,  y^-X^e.  Minimizing  (2)  by  differentiating  with  respect  to 
e  leads  to 


An  example  of  the  redescending  \ p  Is  furnished  by  the  ^  function  proposed  by 
Hampel  [6].  The  Hampel  t  Is  given  by 


The  Hampel  1/  with  breakpoints  a,b,c,  which  we  sometimes  denote  by  Ha(a,b,c), 
is  the  only  $  function  we  will  consider  in  this  report.  It  is  a  very  versatile 
function  which  can  be  made  to  take  several  shapes  depending  on  the  choice  of  the 
breakpoings.  Besides  using  distinct  breakpoints  a,b,c,  we  also  find  the  col¬ 
lapsed  Hampel  ^'s,  Ha(a,a,a)  and  Ha(a,b,b)  to  be  useful  in  filtering. 

A 

Since  (3)  is  nonlinear,  0  must  be  computed  iteratively.  A  simple  but  highly 
effective  method  has  been  developed  for  the  iterative  solution  of  (3).  This 
method  is  merely  an  iterative  application  of  a  weighted  least  squares  algorithm, 
[2],  [5],  and  [7].  Starting  at  an  arbitrary  point  in  the  iteration  sequence 
©l  ,  0*K  '  is  computed  by 


(6)  is  easily  recognized  as  the  solution  of  a  weighted  least  squares  problem 

f  k> 

with  weights  Vr.  .  The  weights  are  computed  Iteratively  by 

J 


u{k) 

Wj 


♦  ((y,i  -  xie{k>)/sk) 

-  xj®i,a>'sk 


A  common  dispersion  measure  s^  is 


where 


s,  *  median 
k  i*l  ,n 


,{k)  . 


r:  -  y. 


(7) 


(8) 

(9) 


We  will  have  occasion  to  apply  this  method  of  iterative  weighted  least  squares 
to  smoothing. 

If  the  probability  density  function  p  of  the  measurement  errors  e^.  is  re¬ 
lated  to  4>  by  p'/p=-4>,  then  the  resulting  M-estlmate  is  maximum  likelihood.  If 
outliers  are  present  in  the  observations,  the  probability  density  function  should 
have  heavier  tails  than  the  usual  Gaussian  density.  For  any  41  function,  we  call 
p*e"p  a  pseudo  density.  In  the  case  of  the  Huber  4/,  the  pseudo  density  is  an 
acceptable  density  having  rather  heavy  tails.  A  plot  of  e"p  for  4»®Hu(l),  the 
Huber  4»  function  with  breakpoint  a*l,  is  given  in  Fig.  1.  For  a  Hampel  4»  function, 
Ha(a,b,c)  the  tails  are  so  heavy  that  e"p  is  no  longer  a  proper  density  function. 

A  plot  of  e"p  for  4>=Ha(l,2,3)  is  given  in  Fig.  2.  Although  the  pseudo  density  may 
not  be  a  proper  probability  density  function,  we  derive  some  filters  and  smoothers 
In  some  of  the  conventional  ways,  e.g.,  conditional  mean  and  maximizing  the  posterior 
density,  with  the  probability  densities  of  the  measurement  errors  replaced  by  pseudo 
densities. 
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II.  APPROXIMATE  NON-GAUSSIAN  FILTERS 

Assume  that  the  state  of  a  process  Is  given  by  the  linear  model 

x(k+l )  =  *(k+l,  k)  x(k)  +  u(k) ,  (10) 

where  the  state  vector  x(k)  of  the  process  is  an  m- vector,  u(k)  is  a  Gaussian 
state  noise  vector  with  zero  mean  and  covariance  Q(k).  $(k+l,  k)  is  an  mxm 

transition  matrix.  Let  scalar  observations  of  the  process  be  given  by 


z(k)  =  H(k)x(k)  +  v(k), 

where  H(k)  is  a  row  vector  and  v(k)  is  a  measurement  noise  error  which  may  be 
contaminated  by  outliers. 

Recursive  filters  have  been  derived  in  several  ways.  For  example,  Sherman 
[8]  has  shown  that  for  a  wide  variety  of  loss  functions  and  a  wide  variety  of 
distributions  underlying  the  observations  the  conditional  mean  of  the  posterior 
distribution  is  a  minimum  variance  estimate.  If  we  apply  this  idea  to  the  filtering 
case,  we  arrive  at  a  minimum  variance  filter.  Another  method  of  recursive  filter 
derivation  Is  by  maximizing  the  posterior  distribution  from  which  we  obtain  the 
MAP  estimates.  In  either  case  we  will  work  with  the  conditional  probability 

i  k  k  k 

density  function  p(x(k)|Z  ),  where  Z  denotes  the  set  of  observations,  Z  = 

|z(l),  z(2),  — ,  z(k)J  .  We  will  derive  robust  filters  by  both  the  conditional 
mean  and  MAP  estimates. 

Using  Bayes  rule  the  posterior  conditional  density  p(x(k)|Z  )  can  be 
written  as 


(11) 
k 

In  order  to  approximate  the  conditional  mean  E[x(k)|Z  ]  of  (11),  we  assume  that 
k-1 

p(x( k) | Z )  is  Gaussian  and  procede  as  in  Masreliez  [9], 


p(x(k)|Zk) 


:z(k)|x(k] 


:x(k)iz1' 


p(z(k)|Zk"1) 
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Let  x( k | k-1 )  be  the  mean  of  p(x(k) | Zk_1 )  and  let  P(k|k-1)  be  Its  covariance. 

A 

Adding  and  subtracting  x(k|k-l)  to  (12)  gives 

x(k|k)  =  x( k | k-1)  +  p-1(z(k)|Zk_1)  f  (x(k)  -  x(k| k-1))  p(x(k) | Zk“1 )p(z(k) | x(k) )  dx 

^m  (13) 

k-1 

Assuming  p(x(k)|Z  )  to  be  Gaussian,  we  can  write 

(x(k)  -  x(k|k-l)p(x(k)|Zk_1)  =  -P(k|k-1)  9  p(x(k)|Zk"1)  (14) 

9x(k) 


Then  we  can  rewrite  (13)  as 


x(k|k) 


x(k|k-l)  -  p-1(z(k)|Zk-1)P(k|k-l) 


p(x(k)|Zk-1) 


Integrating  (15)  by  parts. 


p(z(k)|x(k))  dx 
(15) 


-1 


,k-l 


nm 


,k-l. 


x ( k | k)  =  x ( k | k-1 )  +  p  (z(k)|Z  )P(k|k-l)  p(x(k)|Z*’')  8  p(z(k)|x(k))  dx 


^TT 


(16) 

(17) 


Using  _9  p(z(k)|x(k))  =  -HT(k)  a  p(z(k)lx(k)) 

9x(k)  3z(k) 

in  (16)  we  obtain 

x(k|k)  =  x(  k  |  k-1 )  -  p'1(z(k)|Zk"1)P(k|k-l)HT(k)_9  f  p(x(k)  |Zk_1)p(z(k)  |x(k))  dx 

^Rjm  08) 

Using  p(x(k)|Zk_1)p(z(k)|x(k))  =  p(x(k) ,z(k) jZk-1 ) ,  (18)  becomes 


x( k | k)  =  x( k | k-1 )  +  P( k | k-1 )H  (k)g(z(k)) 


(19) 


where  g(z(k))  is  the  scalar 

g(z(k))  =  -p"1 (z(k) |Zk_1 )  9P(z(k) IZ*'1 ) 

gzTio 


(20) 


(19)  and  (20)  were  derived  by  Mareliez  in  [9]  for  approximating  the  minimum 
variance  filter  when  measurement  noise  is  non-Gaussian.  In  order  to  complete 
the  specification  of  this  approximate  filter,  it  is  necessary  to  derive  an 
expression  for  the  conditional  second  moment, 

P(k|k)  =  E  [  (x(k)  -  x(k | k) )  (x(k)  -  i(k|k))T  Zk  j 

A 

An  expression  for  P(k|k)  is  derived  similar  to  the  derivation  for  x(k|k).  This 
derivation  is  outlined  below. 


P(k|k)  =  E  [  (x(k)  -  x(k| k-1 ) )  (x(k)  -  x(k| k-1 ))’|Zk  ] 

-  (x(k|k)  -  x(k|k-l))  (x(k)  -  x(k| k-1 ))T 
Let  S(k)  =  e[  (x(k)  -  x( k | k-1 ) )  (x(k)  -  x(k | k-1 ))T|Zk  j.  Then  using  (11) 

S(k)  =  p”^ (z(k) |Zk"^ )  f  (x(k)  -  i(k|k-l))  (x(k)  -  i(k|k-l))Tp(x(k)|Zk-1) 

•4m 


(21) 


(22) 


p(z( k) | x(k) )  dx 


»k-l' 


Assuming  p(x(k)|Z  )  Gaussian,  we  use  (14)  and  integrate  by  parts  twice  to 
obtain. 


S(k)  =  P( k | k-1 )  +  P( k | k-1 )HT( k) 


p*1 (z( k) I Z*~ 1 ) a‘p(z(k)  J  Z*~~ 1 ) 


rk-lx„2 


k-1. 


3z(k)‘ 


Combining  (23)  with  (21)  and  (19)  gives 

P(k|k)  =  P(k | k-1 )  -  P(k| k-1 )HT(k)6(z(k) )  H(k)P(k!k-l) 


H(k)P(k | k-1 ) 
(23) 


(24) 


where 


G(z( k) ) 


(25) 
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A  second  method  for  approximate  non-Gaussian  filtering  is  the  marginal 
maximum  likelihood  filter.  This  filter  has  also  been  called  the  maximum  a 

A 

posteriori  or  MAP  filter.  In  this  case  we  find  the  estimate  x(k|k)  which  maxi¬ 
mizes  (11).  Maximizing  (11)  is  equivalent  to  minimizing  -Log  p(x(k)|Zk).  Thus, 
the  MAP  estimate  minimizes 

L(x(k))  =  -log  p(x(k)|Zk-1)  -  log  p(z(k)|x(k))  (26) 

k— 1  a 

If  we  assume  p(x(k)|Z  "  )  is  Gaussian  with  mean  x(k|k-l)  and  covariance  P(k|k-1), 
(26)  becomes 

L(x( k) )  =  1  (x(k)  -  x(k|k-l))T  P“1(k|k-1)  (x(k)  -  x(k|k-l))-  log  p(z(k)|x(k)) 

I 

(27) 

Minimizing  (27)  by  setting  9L(x(k))  =  0  gives 

3x(k) 

x(k | k)  *  x( k | k-1 )  -  P( k | k-1 )HT(k)  9  log  p(z(k) lx(k|k))  (28) 

9z(k) 


The  MAP  filter  formulation  does  not  provide  estimates  of  any  of  the  moments  of 
the  density  p(x(k)|Z  )  but  only  the  mode  of  the  density.  Thus,  in  order  to  con¬ 
tinue  the  MAP  filter  from  point  to  point,  propagation  of  the  first  two  moments 

L 

of  p(x(k)|Z  )  in  time  must  be  provided  by  a  different  formulation.  In  other  words 
the  computation  of  the  MAP  estimate  x(k+l|k+l)  requires  knowledge  of  the  moments 
x(k+l|k)  and  P(k+l|k)  of  p(x(k+l)|Zk)  which  are  not  available  from  the  MAP  formu- 

A 

lation.  We  compute  values  of  the  moments  x(k+l|k)  and  P(k+l|k)  from  the  conditional 
mean  formulation  given  previously  and  consider  the  MAP  estimate  of  (28)  to  be  a 
correction  to  the  conditional  mean. 


III.  ROBUST  CONDITIONAL  MEAN  FILTER 


We  apply  (19),  (20),  and  (24)  to  derive  a  robust  filter  based  on  the 
M-estimates.  Given  a  ^  function  for  an  M-estimate,  we  replace  the  density 
p(z(k)|Zk_1)  In  (20)  by  the  pseudo  density  to  obtain  a  robust  filter.  Thus 

LI  _ 

we  use  p(z(k)|Z  )  =  e“p  where  ^  i s  the  derivative  of  p.  With  this  substitution 
the  filter  estimate  of  (19)  becomes 

x(klk)  *  x(klk-l)  +  P(k|k-l)HT(k)  *  (z(t)  -  H(k)x(k|k-1)  \  (29) 

h  \  h  J 

The  equation  for  the  second  moment  given  by  (24)  becomes 

P(k|k)  =  P(k | k-1 )  -  r£k]_ j Z'j  P(k|k-l)HT(k)H(k)P(k|k-l),  (30) 

where  ^ *  is  the  derivative  of  ip  and  r(k)*z(k)-H(k)x(k|k-l)  is  the  predicted  re¬ 
sidual.  The  filter  equations  are  completed  by  the  equations  for  the  predicted 
moments , 


x(k+l|k)  «  *(k+l,k)x(k|k)  (31) 

P(k+1 1 k)  =  <*>(k+l ,k)P(k|k)$T  (k+l,k)  +  Q(k)  (32) 


In  order  to  insure  the  robustness  of  the  filter  of  (29)-(32),  the  dispersion 
s^  of  the  predicted  residuals  must  be  specified  so  that  It  is  Insensitive  to 
outliers.  We  use  the  MAD  estimate  of  sk  computed  from  past  residuals  as 


sk 


median  |z(k-j)-HT(k-j)x(k-j|k-j-l)| 
j*0,N-l 


(33) 


where  N  is  a  suitably  chosen  integer.  The  robust  conditional  mean  filter  does 
not  require  iteration  as  was  required  for  the  M-estimates  of  regression.  In 
this  sense  the  conditional  mean  filter  corresponds  to  the  one  step  M-estimates 
described  by  Bickel  [9]. 
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IV.  ROBUST  MAP  FILTER 

As  discussed  previously  the  Non-Gausslan  MAP  filter  merely  provides  a 
correction  to  the  conditional  mean  filter.  In  the  following  we  let  x(k|k)  denote 
the  MAP  estimate  and  x(k|k)  the  approximate  mean  of  p(x(k)|Zk).  Given  a  ^ 
function  for  an  M-estlmate,  we  replace  the  density  p(z(k) |x(k|k))  In  (28)  by  the 
pseudo  density  to  obtain  the  robust  MAP  estimate.  Thus,  we  use  p(z(k) |x(k|k))  *  e”p 
where  *  Is  the  derivative  of  p.  With  this  substitution  the  MAP  estimate  of  (28) 
becomes 


x(k | k)  -  x(k|k-l)  +  P(klk-l)HT(k)  */z(k)  -  H(k)x(klk) 


(34) 


To  complete  the  description  of  this  robust  filter  we  use  the  conditional  moments 
x(k+l|k)  =  4>(k+l,k)x(k|k)  (35) 


x( k | k)  =  x(k|k-l) 


and 


P(k|k)  =  P(k|k-1)  - 


+  P ( k I k-1  )HT(k)  ip  fz( k)  -  H(k)x(klk-1)\ 

Sk  l  Sk  / 


(•'  ?  /■'•) 


(36) 


P(k|k-l)HT(k)H(k)P(k|k-l)  (37) 


where  r(k)  =  z(k)  -  H(k)x(k | k-1 ) 

also  P(k+l|k)  *  4>(k+l,k)P(k|k)»T(k+l,k)  +  Q(k)  (38) 


Again  we  use  the  MAD  estimate  on  past  residuals 


sk 


median  |z(k-j)-H(k-j)x(k-j 
j*0,N-l 


k-j )  | 


.6745 


(39) 


Note  that  the  robust  MAP  estimate  specified  by  (34)  requires  Iteration  for 
solution  since  x(k|k)  appears  nonlinearly  on  the  right  hand  side  of  (34).  Several 
simple  methods  for  iteration  are  readily  apparent.  The  first  method  is  to  Iterate 
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(a) 

(34)  directly.  Let  x(k|k)  be  an  arbitrary  point  in  the  Iteration  sequence 

-(o 

and  let  x(k|k)  *  x(k|k-l).  Then  we  iterate  (34)  as 

-(a+1)  _  T  -(a) 

x(k|k)  =  x(k|k-l)  +  P(kl k-1  )H  (k)  *(z(k)  -  H(k)x(klk)\  (40) 

sk  V  sk  1 

In  this  case  we  can  replace  (35)  by 

.(1) 

x(k+l|k)  =  4>(k+l  ,k)x(k| k)  (41) 


Another  simple  method  for  Iteratively  solving  (34)  is  similar  to  the  Iterative 

weighted  least  squares  method  used  to  solve  the  robust  regression  equations.  If 

-(a) 

x( k | k)  is  an  arbitrary  point  in  the  iteration  sequence,  this  method  solves  (34) 
iteratively  as 

xikjk)  ■  x(k|k-l)  +  w£o)  P(ci)(k)  HT(k)  (z(k)  -  H(k)x(k|k-1)),  (42) 

/  \  I  \  »  (0)  _ 

where  W£“;  is  a  scalar  weight,  0<W£a'<l  and  x(k|k)-x(k|k-l) 


-(a)  V 
z(k)  -  H(k)x(klk)  ] 

,  sk  1 

z(k)  -  H(k)x(kl k) 
sk 


(43) 


and 

P(o)(k) 


P”1 (k| k-1 ) 


HT( k)H( k) 


(44) 


The  filter  correction  provided  by  (34)  to  the  conditional  mean  filter  can  be 

A 

quite  sizeable.  A  change  to  the  prediction  equation  which  uses  x(k|k)  on  the 

»(1 ) 

right  hand  of  (41)  rather  than  x(k|k)  might  be  a  desirable  in  some  applications. 
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V.  APPROXIMATE  NON-GAUSSIAN  SMOOTHING 

In  the  following  some  robust  fixed  lag  smoothing  equations  are  derived 

in  a  similar  manner  to  the  derivation  of  the  robust  filter  equations,  i.e., 

using  the  conditional  mean  and  MAP  formulations.  In  fixed  lag  smoothing  an 

estimate  of  the  state  x(k)  of  the  system  described  by  (10)  and  (11)  is  desired 

using  the  measurements  z(l),  z(2),  — ,  z(k),  z(k+l),  — ,  z(k+N).  Thus,  there 

k+N 

s  a  lag  of  N  points  in  obtaining  this  smoothed  state  estimate.  Let  aZ  « 
z(k+l),  z(k+2),  — ,  z(k+N)  J  .  Then  Zk+N  =  ZkUAZk+N.  The  posterior  conditional 
desnity  p(x(k)|Zk+N)  is  given  by 

p(x(k) IZk+l*)  ■  p(aZk+Nlx(k))  p(x(k)IZk)  (45) 

p(4Zk;V> 

1/ 

We  assume  that  the  density  p(x(k)|Z  )  is  Gaussian  and  procede  as  before  to 
obtain  the  conditional  mean,  x(k|k+N)  =  x(k)|Zk+N  j. 

The  conditional  mean  is  given  by 

x( k | k+N)  =  p_1(AZk+N|Zk)  f  x(k)  p(AZk+N|x(k))  p(x(k)|Zk)  dx(k)  (46) 

{m 

-  K  k 

Adding  and  subtracting  x(k|k)  to  (46)  and  assuming  that  p(x(k)|Z  )  is  Gaussian 

gives 

x(k|k+N)  =  x(k|k)  -  p"1 (AZk+N|Zk)  P(k|k) J 


a  ,  p(x(k) |ZK)  p(AZK+N|x(k))  dx(k) 
ax(k) 


where  P(k|k)  is  the  covariance  of  p(x(k)|Z  ).  Integrating  by  parts 


x(k|k+N)  =  x(k | k)  +  p'1(AZk+N|Zk)  P(k|k) 


/[ 


_a  p(AZK+N|x(k))l  p(x(k) |ZK)  dx(k) 
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Assuming  statistical  Independence  of  the  observations 


p(AZK  N|x(k) )  =  n  p(z(k+j) |x(k) ) 
J  • 


P(izk+N|x(k>)  x 


where 


A1  so. 


0  s  n  p(z( k+1 ) | x ( k ) ) 


(llk+jUxML*  -*T(tt+,  t.)  HT(k+j)  ap(z(k+.i)lx(k 


ax(k)  k+j»  k  32(bj)  M 

Substituting  (49)-(52)  in  (48) 

A  a  N 

x(k|k+N)  *  x(k|k)  +  p_1(AZk+N|Zk)  P(k|k)  l  4T(t.  .  t. ) 

j-1  K+J»  k 


^(k+j)  a  f 
azlk+j) 


pUZk+N|x(k))p(x(k)|Zk)  dx( k) 


p(AZk+N|x(k))p(x(k)|Zk)  =  pUZk+N,  x(k) |Zk) 
Substituting  (54)  into  (53)  yields 


x(k|k+N)  =  x(k|k)  -  P(k|k)  *T(tk4j<  tk)  HT(k+j)  p-1(z(k+j) |Zk)  ap(z|k+jj lzkl 
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The  filtered  estimate  x(k|k)  and  P(k|k)  are  given  by  (19)  and  (24). 

A  slightly  different  robust  fixed  lag  smoother  is  obtained  if.  Instead 

k+N 

of  finding  the  conditional  mean  ofp(x(k)|Z  ),  we  compute  the  mode  of 
p(x(k) |Zk+N).  We  will  call  this  method  the  marginal  maximum  likelihood  or 
the  MAP  formulation  of  the  robust  smoother.  This  estimate  maximizes  (45)  or 
equivalently  minimizes 

L(x(k) )  =  -log  pUZk+N|x(k))  -  log  p(x(k)|Zk)  (56) 

k  A 

Assuming  that  p(x(k)|Z  )  is  Gaussian  with  mean  x(k|k)  and  covariance  P(k|k) 

(56)  becomes, 

L(x(k))  =  1/2  (x(k)  -  x(k|k))T  P‘\k\k)  (x(k)  -  x(k|k))  -  log  p(AZk+N|x(k))  (57) 
Minimizing  (57)  by  setting  3L(x(k))  =  0  gives 

3X(k) 

x(k|k+N)  *  x(k|k)  +  P(k|k)  p“1(AZk+N|x(k))  3p(AZk^|x(k))  (58) 

Assuming  statistical  independence  of  the  observations,  we  use  (49)  and  (52) 
in  (58)  to  obtain 

x(k|k+N)  =  x(k|k)  -  P(k|k)  l  *T(tk+j,  tk)  HT(k+j)  p_1(z(k+j) |x(k|k+N))  (59) 

J=1 

3p(z(k+,i)  lx(klk+N)) 

3x(k|k+N) 

x( k | k)  and  P(k|k)  are  obtained  via  (19)  and  (24).  The  MAP  formulation  of  (59) 
can  be  viewed  as  a  correction  to  the  conditional  mean  formulation  of  (55).  Since 

A 

x(k|k+N)  appears  nonllnearly  on  the  right  hand  side  of  (59),  the  solution  of  (59) 
will  require  Iteration. 


VI.  ROBUST  CONDITIONAL  MEAN  SMOOTHING 

In  order  to  obtain  a  robust  fixed  lag  smoother  via  the  conditional  mean 
formulation,  we  replace  the  densities,  p(z(k+j)|Zk)  In  (55)  by  pseudo  den- 
sities.  Thus,  given  a  *  function  for  an  M-estlmate,  we  use  p(z(k+j)|Z  )  *  e 
where  *  is  the  derivative  of  p.  With  this  substitution  the  robust  smoothing 
equation  of  (55)  becomes 


x(k|k+N)  =  x( k | k)  +  P(k|k)  l 


*T  (t. 


tk)  H  ( k+ j ) 


^z(k+j)  -  H(k+j)  »(tk+i<  tk)  x(k|k)\ 

*  \  sk+j  ' 

No  iteration  is  required  for  the  computation  in  (60)  so  that  this  estimate 
corresponds  to  the  one  step  M-estimate. 

In  order  to  insure  the  robustness  of  the  estimate  in  (60),  It  is  necessary 
that  sk+j  in  (60)  be  a  robust  measure  of  dispersion  of  the  residuals,  r(k+j)  = 
z(k+j)  -  H*(tk+j  tj)  x(k|k).  Several  alternatives  are  possible  for  computing 
sk+j*  depending  on  the  assumptions  made  about  the  statistics  of  the  residuals. 

The  simplest  method  for  computing  sk+J.  is  to  assume  it  is  independent  of  j.  In 
this  case  we  can  use  the  same  estimate  (39)  as  the  filter.  Another  possible  es¬ 
timate  when  assuming  sk+J.  to  be  independent  of  j  is 

sk+j  =  sk  *  "^dian  l2(k+j)  "  H(k+j)  ^k-l+j,  ^-1^  x(k_1 1 k-l+N)  |  /^6745  (61) 

J  1 ,  N  1 

If  an  estimate  sk+J.  Is  required  for  each  j,  then  values  of  the  forward  resi¬ 
duals  must  be  saved  for  past  values  of  k.  In  this  case  we  might  use 

sk+j  =  median|z(k-£+j)  -  H(k-£+j)  *(tk_^+j  x(k-£|k-£+N) I  j .6745  (62) 

where  Q  is  a  suitable  integer. 
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VII.  ROBUST  MAP  SMOOTHING 

To  obtain  a  Robust  fixed  lag  smoother  from  the  MAP  smoothing  formulation 

of  (59),  the  densities  p(z(k+j)|Z  )  are  replaced  by  pseudo  densities, 
k  -o 

p(z(k+j)|Z  )  =  e  ,  where  the  derivative  of  p  Is  the  Influence  function 
for  an  M-estimate.  With  this  substitution  (59)  Is 


N  T 

x(kik+N)  *  x(kik)  +  p(kik)  y  *  (tk+.i, 

j*i 


tk)  H  ( k+ j ) 


*k+j 


/z(k+j)  -  H(k+j)  *(tk+i>  tk)  x(k| k+N)\ 

V  Vj  ' 


(63) 


Again  sk+J.  Is  a  robust  measure  of  dispersion  of  the  residuals,  r(k+j)  *  z(k+j)  - 
H(k+j)  ®(tk+J.  tk)  x(k|k+N).  Since  x(k|k+N)  appears  nonllnearly  on  the  right 
hand  side  of  (63),  iteration  is  required  for  its  solution.  The  most  obvious 

A 

method  is  to  iterate  (63)  In  its  present  form,  i.e.,  to  replace  x(k|k+N)  on  the 
right  hand  side  of  (63)  by  x^(k|k+N)  and  x(k|k+N)  on  the  left  side  of  (63)  by 
X(a+T)(k|k+N).  This  iteration  is  generally  unstable  and  therefore  unuseable. 

A  much  better  iteration  procedure  for  (63)  is  to  use  weights  in  a  manner  simi¬ 
lar  to  the  iterated  weighted  least  squares  of  (6)  and  (7)  used  to  obtain  the 
M-estimates  of  regression.  Using  this  procedure  the  iterative  solution  of  (63) 
is  defined  by 

x(a+1)(k|k+N)  =  x(k|k)  +  P^(k)  J  *T(tk+.i,  V 

sk?j 


j*l 


(z(k+j)  -  H(k+j)  *(tk+^  tk)  x(k|k)) 


where 

D(a) 


(k)  = 


-i  5  u{°0 

P  '(k|k)  +  J  X. 

J-1 


‘T(Vj,  V  HT(k+j)H(k+j) 


(64) 


tk)]  1  (65) 
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The  iteration  starts  with  x(k|k+N)  *  x(k|k).  The  filtered  quantities  x(k|k) 
and  P( k | k)  are  obtained  from  (29)  and  (30).  The  iteration  specified  by  (64)-(66) 
usually  converges  very  rapidly,  most  often  in  two  or  three  cycles. 


VIII.  EVALUATION  OF  ROBUST  FILTERING 

Evaluation  of  the  robust  filtering  methods  described  here  has  been  done 
with  a  view  toward  eventual  application  to  trajectory  estimation.  Emphasis  in 
the  evaluation  is  on  tn°  use  of  simulated  rather  than  real  trajectory  data.  This 
allows  a  quantitative  determination  of  any  advantages  in  the  use  of  robust  filter¬ 
ing  in  the  presence  of  outliers  and  also  any  loss  in  efficiency  using  robust  meth¬ 
ods  when  no  outliers  are  present.  The  simulated  trajectory  is  that  of  a  constant 
velocity,  level  flying  aircraft.  The  filter  model  assumes  the  trajectory  to  have 
constant  acceleration  in  three  cartesian  coordinates.  Let  x,  y,  z  be  the  East, 
North,  and  Up  components  of  trajectory  position.  We  assume  that  the  dynamic 
model  for  each  of  the  coordinates  is  given  by 


”i(k  +  1) 

T  a  a277 

xiOO 

“o 

x2(k  +  1) 

= 

0  1  A 

x2(k) 

+ 

0 

_Xa(k  +  1J_ 

0^  0  1 

_x3(iO 

_w(ki 

whereA*  t^+^  -  t^  so  that  Xj(k),  x2(k),  and  x3(k)  are  position,  velocity,  and 

acceleration  components,  respectively.  w(k)  is  a  zero  mean  Gaussian  acceleration 
state  noise  with  variance  q.  The  filter  observations,  z(k)  are  scalar  positions 
corrupted  by  additive  noise. 


z(k)  =  Hx(k)  +  v(k)  ( 68) 

with  H  =  [1  0  Oj.  The  measurement  noise  v(k)  is  Gaussian  with  covariance  R(k). 
Two  different  methods  are  used  to  generate  outliers  in  the  observations.  One 
method  is  to  choose  the  variance  R(k)  of  the  noise  v(k)  as 


R(k) 


Ri(k)  if  no  outlier 
R2 ( k)  if  outlier  present 


(69) 


with  R2(k)  »  Ri(k).  In  this  case  the  mean  of  the  measurement  noise  is  zero. 
The  second  method  used  to  generate  outliers  in  the  observations  is  to  choose 
the  meanji(k)  of  the  observation  noise  as 
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A»U) 


0  if  no  outlier 

/J i(k)  if  outlier  present 


(70) 


In  this  case  the  variance  of  the  observation  noise  is  Ri(k).  In  either  method 
we  must  decide  whether  or  not  an  outlier  is  to  be  present  in  the  data  at  each  time 
t^.  We  do  this  by  using  a  two  state  Harkov  chain.  Let  i  denote  the  state  of  the 

Harkov  chain,  i  =  1  is  the  state  of  no  outlier  present  and  i  =  2  is  the  state  if 
an  outlier  is  present  in  the  data.  Let  P.  .(k)  be  the  probability  of  a  transition 

*  J 

from  state  i  to  state  j  in  the  interval  (tk_-|,  tk).  The  transition  probabilities 

are  chosen  to  provide  a  given  percentage  of  outliers  and  desired  run  lengths  of 
outliers  in  the  observations.  The  transitions  between  states  are  realized  by  use 
of  a  pseudo  random  number  generator. 

The  constant  velocity  trajectory  used  for  evaluation  is  given  by 

x(W  ’  xltk> +  ;<Vi  -  V 

y(tk+])  =  y(tk)  +  yUk+1  -  tk) 

2'tk+l)  “  2*tk*  +  z^tk+l  '  V  I7’) 

with  x  =  -550ft/sec,  y  =  -252ft/sec,  and  z  =  0.  A  sampling  interval  of  tk+^  -  tk  = 

.05  sec  was  used,  a  monte  carlo  evaluation  of  the  robust  filter  is  done  by  comput- 

ing  some  statistics  of  the  filtering  errors  over  N  filter  runs.  Let  x^t^), 

*  *  th 

yi ( tk) ,  and  z^(tk)  denote  the  fi  :erf:d  position  estimates  at  time  tk  for  the  i— 

run  and  let  xi ( tk )  =  x^^)  -  x(tk),  ^ ( tk)  =  y^- ( tk)  -  y(tk),  and  z^tj,)  « 

^  i,L 

z ^ ( tk )  -  z(tk)  denote  the  errors  in  the  filtered  positions  for  the  i—  filter  run 

at  time  tfc.  Also,  let  x.(tk)  =  x.(tk)  -  x,  y^)  *  y^t^  -  y,  and  z^)  = 
a  'v 

•  •  ••  M 

Zj(tk)  -  z  denote  errors  in  filtered  velocity  estimates  and  x^(tk)  =  Xj(tk)  "  x» 

••  ••  ••  ••  ••  •• 

y^ ( tk)  *  y^(tk)  -  y,  and  z.j(tk)  =  z^(tk)  -  z  be  the  errors  in  the  filtered  acceler¬ 
ations  for  the  i—  run  at  time  tk<  We  evaluate  only  the  RSS  position,  velocity, 
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r 


and  acceleration  estimates. 


2  2  2  1 J 

MV  ■  (*,  <‘k> +  *1  <V  +  *i  <V)  2 


/CV  2  *  2  *  2  \  V2 

VV  s  vxi  +  yi  +  Zi  ) 

We  compute  the  sample  averages  of  the  rss  errors  defined  in  (72). 


(72) 


*<V  *  if  j,  MV 
*<V '  *,*,  MV 

*<V  *  fi  MV  *73' 

In  order  to  reduce  the  evaluation  of  the  robust  filter  to  the  comparison 
of  only  a  few  numbers,  we  compute  the  time  average  of  estimation  errors  in 
(73) 

=  .  M 

R  -  TT  l  R(tu) 

Mk=l  k 


=  ,  M  - 

ft  =  IT  I  ft(tj 

Mk=l  k 
M  _ 


(74) 


k=l 


where  M  is  the  total  number  of  filtered  points.  Unless  otherwise  specified. 


N  *  25,  (k)  =  20  ft.,  Pi2(k)  *  -05,  and  P2i(k)  s  .5.  These  transition 

probabilities  for  the  Markov  chain  provide  an  outlier  contamination  of  about 
9%  and  an  average  outlier  run  length  of  three.  A  state  noise  covariance, 
q  =  5,  was  used  for  all  filter  runs.  In  all  runs  of  the  robust  filter  the 
measurement  noise  covariance,  Ri(k),  was  unknown  to  the  filter.  The  residual 
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variance,  which  is  the  only  quantity  required  by  the  filter  which  involves 
Rj(k),  was  estimated  using  the  MAD  estimate  of  (33). 

Figure  3  compares  the  average  RSS  position  errors  for  the  two  robust 
bilters  using  the  Hampel  $  functions  Ha(4,  4,  4)  and  H_ (2,  3,  4).  Figure 
4  gives  the  RSS  velocity  error  comparison  for  the  same  two  filters.  Also 
indicated  in  Figures  3  and  4  are  the  Ideal  RSS  error  values  which  were 
obtained  using  an  ordinary  Kalman  filter  with  no  outliers  present  and  using 
a  known  measurement  covariance,  Ri(k)  =  400.  We  note  from  Figures  3  and 
4  that  neither  of  the  robust  filters  lose  much  efficiency  from  the  ideal 
values  when  no  outliers  are  present.  In  Figures  3  and  4  the  magnitude  of 

the  outlier  contamination  is  yj(k)  =  C  •  /R^k)  for  various  values  of  C. 

The  error  curves  in  Figures  3  and  4  behave  as  expected.  Since  outliers 
small  in  relation  to  the  measurement  noise  are  hardest  to  detect,  the 
error  curve  rises  sharply.  Outliers  large  relative  to  the  measurement 
noise  are  easy  to  detect  so  that  the  error  curve  returns  to  zero  for  large 
outliers.  We  see  from  Figures  3  and  4  that  H_ (2,  3,  4)  has  a  significantly 

a 

smaller  estimation  error  than  H  (4,  4,  4).  Except  for  the  way  in  which 

G 

the  dispersion  of  the  residuals  is  measured,  i.e.,  the  MAD  estimate  of  (33), 
H  (4,  4,  4)  is  a  conventional  way  of  handling  outliers  in  a  Kalman  filter 

d 

application.  Using  H  (4,  4,  4)  any  observation  whose  predicted  residual 

O 

is  greater  than  4  •  is  not  processed  and  any  observation  whose  predicted 
residual  is  less  than  4  •  Sk  is  processed  as  an  ordinary  Kalman  filter  ob¬ 
servation. 


We  can  reduce  the  RSS  estimation  errors  for  the  robust  filter  still  farther 
by  pulling  In  the  breakpoints  of  the  Hampel  #  function.  Figures  5  and  6 
compare  the  RSS  position  and  velocity  estimation  errors  for  the  filters 
H#(2,  3,  4)  and  Ha(l,  2,  3).  He  note  that  Ha(l.  2,  3)  results  In  a  con¬ 
siderable  decrease  In  estimation  error  from  H.(2,  3,  4). 

Q 


OUTLIER  LEVEL 
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The  Iterated  filter  or  MAP  ftlter  given  In  (34)  -  (38)  mas  evaluated 
under  the  same  conditions  as  the  condtttonal  mean  filter.  Comparison  of 
the  average  RSS  estimation  errors  for  the  conditional  mean  and  MAP  filters 
showed  no  discernible  differences.  Because  the  MAP  filter  offers  no  Improve¬ 
ment  over  the  conditional  mean  filter  and  because  the  conditional  mean  filter 
Is  the  simpler  of  the  two  robust  filters,  we  will  not  give  any  further  eval¬ 
uation  of*  the  MAP  filter. 

The  robust  filter  was  also  evaluated  with  respect  to  Its  ability  to 
adapt  to  changes  in  measurement  noise  variance.  Using  the  same  simulated 
trajectory  as  before,  the  measurement  noise  standard  deviation  was  «'R1(k)  > 

20  ft  for  0  -  t  -  25  sec.,  ✓RjlkJ  »  100  ft  for  25  <  t  -  40  sec.,  and 

✓Rj(k)  ■  50  ft  for  40  <  t  -  50  sec.  Figures  7  and  8  compare  the  RSS  posi¬ 
tion  and  velocity  estimation  errors  for  the  filters  using  H  (1,  2,  3)  and 

A 

H  (3,  3,  3)  for  the  above  measurement  noise  variations.  Again,  we  find 
a 

that  the  robust  filter  H  (1,  2,  3)  has  a  considerably  smaller  mean  square 

A 

error  In  the  presence  of  outliers  than  the  more  conventional  way  of  handl¬ 
ing  outliers  Implemented  In  Ha(3,  3,  3). 


Figure  7 
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Figure  8 
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A  Monte  Carlo  evaluation  of  the  robust,  conditional  mean  smoother  was 
performed  using  the  same  simulated  trajectory  as  used  for  the  filter  eval¬ 
uation.  A  smoothing  Interval  of  one  sec.  or  20  samples  was  used.  The  Monte 
Carlo  sample  size  for  the  smoother  evaluation  Is  N  ■  10.  The  measurement 
noise  standard  deviation  was  Increased  so  that  ✓R1(K)  •  50  ft.  The  outlier 
contamination  Is  the  same  as  for  filter  evaluation.  I.e. ,  a  contamination 
of  8.8%  and  an  average  outlier  run  length  of  three.  Figures  9  and  10  com¬ 
pare  the  average  RSS  position  and  average  RSS  velocity  errors  for  the  two 
smoothers  with  H  (2,  3.  4)  and  H_ (4.  4,  4)  for  various  levels  of  outliers. 

a 

As  In  the  filtering  case  we  see  that  H#(2,  3.  4)  results  In  a  significant 

decrease  In  estimation  error  from  the  smoother  using  the  more  conventional 
H  (4.  4.  4)  method  of  handling  outliers. 

O 
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